Performance of Inverted Indices in Distributed Text Document Retrieval Systems

نویسندگان

  • Anthony Tomasic
  • Hector Garcia-Molina
چکیده

The performance of distributed text document retrieval systems is strongly in uenced by the organization of the inverted index. This paper compares the performance impact on query processing of various physical organizations for inverted lists. We present a new probabilistic model of the database and queries. Simulation experiments determine which variables most strongly in uence response time and throughput. This leads to a set of design trade-o s over a wide range of hardware con gurations and new parallel query processing strategies.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Performance of Inverted Indices in Shared - Nothing

The performance of distributed text document retrieval systems is strongly innuenced by the organization of the inverted index. This paper compares the performance impact on query processing of various physical organizations for inverted lists. We present a new prob-abilistic model of the database and queries. Simulation experiments determine which variables most strongly in-uence response time...

متن کامل

Unaligned Binary Codes for Index Compression in Schema-Independent Text Retrieval Systems

We examine index compression techniques for schemaindependent inverted files used in text retrieval systems. Schema-independent inverted files contain full positional information for all index terms and allow the structural unit of retrieval to be specified dynamically at query time, rather than statically during index construction. Schemaindependent indices have different characteristics than ...

متن کامل

SASE: Implementation of a Compressed Text Search Engine

Keyword based search engines are the basic building block of text retrieval systems. Higher level systems like content sensitive search engines and knowledgebased systems still rely on keyword search as the underlying text retrieval mechanism. With the explosive growth in content, Internet and Intranet information repositories require efficient mechanisms to store as well as index data. In this...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1993